Thermally activated processes in polymer dynamics 



Lorenzo Bongini' 1 ', Roberto Livi^M 1 ), Antonio Politi' 3 '^ 1 ), and Alessandro Torcini^'W 
(1) 1NFM, UdR Firenze, via Sansone, 1 - 1-50019 Sesto Fiorentino, Italy 
(2) Dipartimento di Fisica, Universitd di Firenze, via Sansone, 1 - 1-50019 Sesto Fiorentino, Italy 
(3) Istituto Nazionale di Ottica Applicata, L.go E. Fermi, 6 - 1-50125 Firenze, Italy 
1 (Dated: February 6, 2008) 

o ; 

Jumps between neighboring minima in the energy landscape of both homopolymeric and heteropolymeric 
, chains are numerically investigated by determining the average escape time from different valleys. The numer- 

ical results are compared to the theoretical expression derived by Langer [J.S. Langer, Ann. Phys. 54 (1969) 
258] with reference to a 2iV-dimensional space. Our simulations indicate that the dynamics within the native 
valley is well described by a sequence of thermally activated process up to temperatures well above the folding 
£f>l . temperature. At larger temperatures, systematic deviations from the Langer's estimate are instead observed. 

' Several sources for such discrepancies are thoroughly discussed. 
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I. INTRODUCTION 



Polymeric chains exhibit quite a rich variety of dynamical properties. At high temperatures, kinetic energy is large enough 
to allow a chain exploring most of the accessible phase-space. In this regime, the polymer typically assumes a "random coil" 
structure. At intermediate temperatures, internal forces and the interaction with the solvent become strong enough to stabilize 
compact configurations Qj. However, kinetic energy fluctuations are still able to drive the chain from one to another minimum 
of the energy landscape. The properties of this itinerant dynamics depend on several factors: the height of the barriers separating 
neighboring minima, their accessiblity and, more generally, the overall structure of the energy landscape. Upon further decreas- 
ing the temperature, an heteropolymer typically undergoes a glass transition and may freeze in one of several distinct free-energy 
minima. Only some peculiar heteropolymers exhibit a transition to a "folding regime", i.e. are characterized by a relatively fast 
convergence towards the absolute energy minimum, irrespectively of the initial state. In this case, the heteropolymer is said to 
be a "good folder" and it can be viewed as a specimen of a protein, that always evolves to its native configuration (NC) Q]. 

Independently whether a given polymer is homogeneous or heterogeneous, whether it is a good or a bad folder, a complete 
understanding of its dynamical properties passes through the description of the jump processes between different energy valleys 
0,0]. Free-energy valleys are indeed collections of distinct minima and studying the connectivity of such minima can help 
identifying and parametrizing the relevant macroscopic states |5l|6J]. In the hope to eventually make substantial progress along 
this line, in this paper we aim at testing the validity of the expressions utilized to characterize the single escape processes. In the 
current literature, the escape is often viewed as an activation process and Kramers-like formulae, derived for low-dimensional 
systems, are applied to characterize high-dimensional systems without testing their validity. In this paper we present a detailed 
check of the formula derived by Langer in 1969 |7], finding that the escape process is strongly influenced by the entropic 
• contribution associated with the local geometry of the energy landscape. 

In Sec. 2 we introduce the polymer model used as a testing ground for the numerical analysis of activation processes in 
relatively high-dimensional systems |8]. It consists of a chain of two types of beads embedded in a two-dimensional space. In 
the same section, we briefly recall the relevant properties of both homogeneous and heterogeneous systems upon varying the 
temperature. In Sec. 3, the general theoretical ideas lying behind the derivation of Langer's formula are briefly summarized. 
The technical difficulties associated with the determination of geometrical factors are also discussed together with some possible 
approximation schemes. In Section 4, theoretical predictions are compared with numerical simulations for specimens of bad and 
good folders. In spite of an overall qualitative agreement, systematic deviations are found at relatively high temperatures, the 
origin of which is discussed in Sec. 5, where several effects are separately discussed. Finally, in Sec. 6, the main conclusions 
are summarized and the open problems briefly recalled. 



II. THE MODEL: DEFINITIONS AND THERMODYNAMIC AL PROPERTIES 



In this paper we study the escape process from an energy valley with reference to a model thoroughly investigated in 01, 
where the authors slightly modified a previous version, originally introduced in |8]. The model, designed to simulate sequences 
of amino acids interacting within a solvent, describes a chain of L monomers embedded in a 2-dimensional space. At variance 
with 1 8], where monomers were rigidly linked along the backbone, in 0] a nearest-neighbor harmonic potential was instead 
assumed, 



Vi(n. l+ i) = a(r Vi+ i - r ) 2 , 



(1) 
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where nj — \/ (xi — Xj) 2 + (jji — yj) 2 is the distance between the ith and the j-th monomer, while Xi, yi are the coordinates 
of the ith monomer. Without loss of generality, the equilibrium distance r$ is set equal to 1, while the interaction constant a has 
been fixed equal to 20 so as to induce an almost rigid interaction between neighbouring monomers 

The second term expresses the energy cost of local bending; it is described by the three-body interaction term 

«<*) = m 

where 9i is the angle formed between the links connecting the (t — l)-st, the i-th and the (i + l)-st monomers. In particular, 

(xi - Xi-x){x i+ x -Xi) + (yi - - 
cos Qi = , (3) 

where —it < 6i < ir. 

Finally, heterogeneity is ensured by Lennard-Jones type interaction between non-neighbouring monomers (|i — j\ > 1) 

%(ru) = 4r-¥ (4) 

«,i hi 



where 



= ~(2-3&-3^- + 5^-) 



and = indicates that the ith monomer is hydrophobic (H), while = 1 corresponds to a polar (P) one. As a result, the 
interaction is attractive if the two monomers are either both hydrophobic or both polar (cjj = 1 and 1/2, respectively), while 
it is repulsive, if the monomers belong to different species (in which case cy = —1/2). This potential choice simulates the 
effective interaction among H and P monomers in the presence of a solvent. In fact, since H monomers prefer to avoid a direct 
contact with the solvent, they tend to clusterize in the interior where they can be shielded from water by a shell of P monomers. 
The net result is an effective H-H attraction and an H-P repulsion as assumed in the model. 
Altogeher, the heteropolymer Hamiltonian writes as 

H = E Pl l 2m P2y ' 1 + £ + E V ^ + E E Unj^i,^) (5) 

i=l i=l i=2 i=l j=i+2 

where all monomers are assumed to have the same mass m and momenta are defined as (p x ,i,Py,i) ■— m(xi, yi). 

Accordingly, each heteropolymer is perfectly identified by a binary sequence of 0s and Is specifying the nature of each 
monomer. Those sequences for which the heteropolymer shape converges systematically (at intermediate temperatures) towards 
the same "native" configuration independently of the initial condition are identified as "good folders". Previous studies, mostly 
based on Monte Carlo techniques indicate that this happens only in a few cases f!llfT2lfl3ll and the scenario has been confirmed 
also by molecular dynamics simulations |9]. 

In what follows, we shall limit our investigations to the three following cases, all of length L = 20, 

• [S0]= [0000 0000 0000 0000 0000] a homopolymer composed of hydrophobic residues only; 

• [Sl]=[0001 0001 0001 1001 1000] a sequence first studied in ||12] (therein indicated with the code number 81) where it 
was identified as a good folder; 

S4]=[1110 0100 0000 0001 0010] a randomly generated sequence with 6 P-type residues, identified as a bad folder in 



A reasonably accurate characterization of each sequence can be obtained by determining three transition temperatures. The 
first one, T$, denotes the temperature below which the polymer is in a collapsed rather than in a random-coil configuration 1 1]. 
It can be determined by studying the temperature dependence of the gyration radius R gy (T): Tq corresponds to the maximum 
of dR gy (T)/dT. 

The folding temperature Tf is the temperature below which the heteropolymer stays predominantly in the native valley. Here, 
analogously to 1 9], we define the native valley as the basins of attraction of the NC and of its neighbouring minima. A quantitative 
estimate of Tj can be then obtained by determining the temperature at which the chain spends 50% of the time within the native 
valley. 
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FIG. 1 : Configurations of all the NNMs for the three considered sequences : SO (a), S 1 (b) and S4 (c). 



Finally, the glass-transition temperature T g can be identified by comparing (finite) time averages performed starting from 
different initial conditions. Specifically, we have considered unfolding (USs) and folding (FSs) simulations, whose initial con- 
ditions correspond to the NC and to random coil configurations, respectively. In practice, T g is defined as the temperature below 
which the relative difference between USs and FSs averages of the internal energy U is larger than 10%. 

We have determined Tg, Tf, and T g , by means of molecular-dynamics simulations with each monomer being in contact with 
a stochastic thermal reservoir at temperature T 

OH 

Zi(t)=Pz,i/m ; p zi (t) = -- jp z .i(t) +Vz,i( t ) (6) 

OZi 

Here z% is introduced as a shorthand notation for both the spatial coordinates x% and j/,, 7 is the dissipation rate and rj Zt i(t) is a 
Gaussian distributed, (5-correlated random noise 

(»7.,i(*)»7*j(0)) = 2jmk B T6(t)S id , (7) 

where, k B denotes the Boltzmann constant and T is the temperature (for the sake of simplicity, both k B and m have been set to 
unit). 

The temperature values obtained for 5*0 — 51 — SA are reported in Tab.||J together with the number no of the minima directly 
connected with the NC. The glassy transition has been determined by performing averages over a time lapse on the order of 10 6 
units. These results are very close to those reported in [9], where a deterministic Nose-Hoover thermostatting scheme fl4ll was 
used instead. The advantage of using the Langevin equation (|6} is that the damping rate can be directly controlled. As we shall 
see in the next section, this is a crucial ingredient for characterizing the escape rate from a given valley. 

A more detailed characterization of heteropolymer dynamics can be obtained by identifying at least the most visited minima 
of the potential energy V = Vi + V2 + V3. Here, we have proceeded by sampling a generic trajectory at time intervals of length 
At ~ 1 — 5. Then, the resulting instantaneous configurations have been taken as initial conditions for the overdamped dynamics 



dH 

dzi 



(8) 



TABLE I: The collapse-transition temperature Tg, the glassy temperature T g , the folding temperature Tf, the number no of neaerst neighbour 
minima of the NC for the sequences SO (homopolymer), SI (good folder) and S4 (bad folder) 





SO 


SI 


S4 


Tg 


0.16 


0.11 


0.13 


T g 


0.022 


0.048 


0.025 


Tf 


0.044 


0.061 


0.044 


n 


31 


37 


36 
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FIG. 2: Potential energy versus time during a folding simulation for the sequence SI at T = 0.055 (grey circles). The solid line is a local 
average with exponentially increasing window size. 

which drives the system to the minimum energy state, whose basin of attraction contains the initial condition 0], The minima 
separated from the absolute minimum (the NC) by a single energy barrier have been denoted as nearest neighbouring minima 
(NNM), while those separated from the NC by two barriers as second-nearest neighbouring minima (2NNM), and so on. The 
NNM configurations for all the three examined sequences are reported in Fig.^ Before passing to a specific discussion of the 
escape from a given valley, it is convenient to illustrate the outcome of a typical FS in the temperature range T g < T < Tg. The 
evolution of the difference AV between the instantaneous potential energy and the potential energy Vq of the NC is reported 
in Fig.|2]for the heteropolymer SI. A series of sudden conformational changes are clearly identifiable from the various energy 
drops (notice the logarithmic scale of both axes). Once SI enters the native valley, it remains there for a very long time, although 
jumps towards neighbouring minima can occasionally occur. 

III. ESCAPE RATE FROM A METASTABLE STATE 

Since the publication of the pioneering paper of Kramers til l, the problem of determining the escape rate from a metastable 
state has been addressed in many different contexts. Here, we consider the overdamped dynamics of an ensemble of N interacting 
particles in a 2d environment. We restrict our discussion to the overdamped limit, since it is expected that in the protein folding 
problem, the time scale of energy exchanges with the thermal bath (through collisions with water molecules) is rather fast (we 
shall anyway return to this point later on). The probability density P(r, t) for a configuration to be in an infinitesimal volume 
around the state r = (rj., . . . , r2jv) = (x\ , yi, £2, • • • , xn ,Un) at time t satisfies the Fokker-Planck equation 
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where V(r) represents the energy potential. The above is nothing but a continuity equation, with the r.h.s. representing the 
divergence of the probability flux 



Ji 



1 

7m 



dV{r) 



dr.; 



kBT-^- 



P(v,t) 



k B T 



■ exp 



-V(r) 
knT 



d_ 

dn 



exp 



P{r,t) 



(10) 



The stationary solution (with no flux boundary conditions, Ji = 0) is simply given by P(r) — exp[V(r) / (ksT)]. Let us now 
assume that the energy landscape exhibits at least two local minima m a and nib with energies respectively equal to V a and V&: 
we want to estimate the escape rate from the basin of attraction of m a . The boundary separating the basins of attractions of the 
two minima coincides with the stable manifold of possibly more than one saddle point. Let us denote the energy on the saddle 
with r s is V s . If the system is prepared into the state m a , a flux J sets in: if the flux itself is weak, it is basically constant in 
time and one can approach the problem by determining the stationary state with J being a solenoidal field. Once J is given, 
the escape rate Y can be obtained by integrating the outgoing flux over the whole boundary of the well. We are not aware of a 
general solution in more than one dimension. 

The most general case amenable to an analytic treatment is that of a potential which, in the vicinity of the saddle point, can be 
separated into two distinct contributions 



V(r)-V s = V l \(r l \) + V±(r 1 _) 



(11) 



where rn is the distance from the basin boundary (measured along the unstable manifold of the saddle), while the vector rj_ 
parametrizes all other directions in phase space. Finally, the zeroes of Vji and V± are set in the saddle point. Under the above 
assumptions, the only nonzero component of the flux is 



k B T 

7m 



■ exp 



V(r)\ dQ_ 

dr\\ 



knT 



(12) 



where we have introduced Q(r) = exp(V/ ksT)P(r) and J« depends only on ry . The vanishing of J± implies that Q depends 
only on rn . Accordingly, Eq. fl!2i can be solved to yield 



V||(5) 

e "b t d£ 



(13) 



where the integration constant is determined by imposing that Q(r) and, accordingly, P(r) vanish along the boundary. The 
multiplicative constant C can be finally determined by normalizing the integral of P(r): 



C = e~ 



e k B T d£ 



(14) 



The first integral is restricted to the basin of attraction of m a . In the small temperature limit, in the region where e k B T is 
significantly different from 0, the last integral is basically constant, so that we can replace its lower border with r a , thus writing 



C = c 



(15) 



where 



h = 








h = 


f 







k B T dr 



e k B T dt; 



(16) 
(17) 



The flux then writes 



^11 = 



AV 

knT e k B T 



-e 'n' 



im I a I\\ 

where AV = V s — V a . By integrating on the basin boundary, one finally obtains 

_ k B T I\ av 

T = — ^e k s T 

jm I a I» 



(18) 



(19) 
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where 



~dr ± (20) 

Eq. (1191 holds under the assumption of a sufficiently small temperature (AV > ksT) and the separability condition M li . The 
former hypothesis is needed to ensure a sufficiently slow flux to guarantee that a quasi-stationary approach holds and the integral 
in Eq. dl4t factorizes. The latter hypothesis ensures that Eq. dl3t indeed represents a meaningful solution of the problem. One 
should notice that separability is required to hold only in the region around the saddle where the integral I± has to be performed. 

If the temperature is small enough, only the leading quadratic terms are relevant in the computation of the integrals I a , Iu 
and I± . In this harmonic limit, they reduce to Gaussian integrals that can be computed by diagonalizing the Hamiltonian. Upon 
denoting with w, the 2N frequencies in the vicinity of the minimum m a ([wi*'] 2 = — A^/m, A a ^ being the negative i\h 
eigenvalue of the Hessian), with u>y the 27V — 1 frequencies around the saddle (those corresponding to the stable directions), 

and with u>u the rate associated to the only expanding direction ([w||] 2 = /m), one obtains the expression first derived by 
Langerin 1969 fid 0. 



nf '- w 



(21) 



where R := y IIj=i l-^-s 1/ Ili=i l-^o I, can be interpreted as an entropy factor II 8lll9ll . In the case of continuous symmetries 
(as, e.g., translational and rotational symmetries), Hessian eigenvalues corresponding to Goldstone modes vanish. They have to 
be excluded in the frequency products appearing in Eq. Mil | ljg]. 

Expression \2\\ is routinely employed in studies of many-body systems, including relaxation dynamics in glasses llal and in 
the estimation of entropy barriers in clusters 1 19]. However, its validity range has not been thoroughly investigated. For instance, 
in Ref. |20] a Master equation is constructed for a cluster of 19 atoms by identifying directly minima and saddles, but estimating 
transition probabilities only from an expression similar to Eq. (12 1 1 . 

In the weak damping limit, Eq. ( 12 li generalizes to |21] 



where the multiplicative correction 



T L = — l j,e , (22) 



C = , (23) 

l + v /l + (2^|/7) 2 



depends on the ratio between the damping constant and the divergence rate along the expanding direction. 

Finally, we mention a simplified formula for the escape rate that is somehow halfway between the general expression ( I19t and 
that one corresponding to the linearization (12 li . Under the hypothesis of a fully separable potential, one can factorize the two 
integrals I a and In into products of one-dimensional integrals 



2JV 



Ia = J] / e (24) 

i— 1 

2JV-1 „ V< i >(r< i )) 

i|| = Y[ c ^^dr^ (25) 



i=i 



where Vi is the zth component of the potential in the vicinity of the minimum with the zero of the scale such that Vi (0) = 0, and 
V"W is the analogous around the saddles for the stable directions. 

In the following section we compare Langer prediction J2 It with our numerical results. In Sect. V we discuss the limits 
of applicability of Eq. (12 1 i and we test the improved expression for the escape rate (1 1 9i (with the integrals estimated as in 
Eqs. d24l25l T). 



IV. NUMERICAL RESULTS 



In order to characterize the dynamics of the polymer in the native valley we have determined both analytically and numerically 
the escape rate from the NC towards any of the NNMs. The procedure for identifying the NNMs from a database of "inherent" 
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minima, constructed by following the method outlined in Sect. II, relies on the identification of the minimal energy path 
connecting each NNM to the NC. The algorithm used to find these paths is described in the Appendix: it allows identifying the 
saddle separating any two minima. 

A few pairs of NNMs turn out to be connected by more than one (up to 3) minimal energy path, this implying that they are 
separated by more than one saddle. In these cases, one should, in principle, compare the numerically determined escape rate with 
the sum of the probability flows Tl through the different saddles. As a matter of fact, we limited to consider the contribution 
of the saddle yielding the maximal flow. This approximation is definitely negligible with respect to the discrepancies between 
numerical and theoretical estimates fully discussed in the following. 

In Fig.[3]the three potential contributions V\, V2, and V3 are reported for a minimal energy path connecting the NC to one of 
the NNMs. A common feature to all the examined paths is that the main contribution to the potential energy barrier arises from 
the Lennard-Jones term. This confirms that the term driving the folding is indeed the long-range potential term mimicking the 
hydrophobicity effects (as already mentioned in (3). 




FIG. 3: Potential energy profiles versus the distance from the NC (measured along the unstable manifold of the saddle) rii. The three curves 
correspond to the different potential contributions for a minimal energy path connecting the NC to a nearest neighboring minimum for the 
sequence SI. The solid line indicates the Lennard-Jones contribution V3 0, while the dashed line the harmonic term V\ Q and the dot- 
dashed line the potential term V2 0. The last two terms are shifted along the vertical axis by a factor —5.2 and —5.75, respectively. 



The evaluation of Tl requires the knowledge of the eigenvalues of the Hessian in both the NC and a suitable saddle. In our 
model, because of translational and rotational symmetries, three eigenvalues always vanish. This is clearly seen in Fig.[4l where 

(i) (i) 

the frequency spectrum ui a of the NC is plotted for the three sequences (panel a), together with the spectrum u> ± of one saddle 

(i) 

(panel b). The uja spectrum decreases smoothly from values around 13 down to 0. It is interesting to notice that all spectra 
do not differ significantly from what one would obtain for a purely harmonic chain, in which case the spectrum would decrease 
from a maximum frequency equal to 2y/2a — 12.6 down to zero. It is only at lower frequencies that differences among the 
various sequences can be appreciated: in fact, this spectral band is basically determined by the angular motion that is primarily 
controlled by the cosine 10 and Lennard-Jones @ potentials. 

For what concerns the value of a>ii, it turns out to range between 0.3 and 1.8 in all saddles. 

Then, we have directly determined the escape rate T(j) from the NC by performing several USs with the damping constant 
set equal to 6.9. Every At time units, the "inherent" polymer configuration is determined by a steepest descent method. As soon 
as the polymer leaves the basin of attraction of the NC, the corresponding time is registered together with the new minimum that 
has been reached. Let us denote with Mj the number of USs ending in the j-th minimum, with (tj) the corresponding average 
escape time and with M — Y^jli Mj the total number of USs. We have verified that (tj) is independent of j. This indicates 
that the polymer spends most of the time before any jump in exploring the same region in the phase space, consistently with the 
hypothesis of a thermally activated process. Thus, the escape rate toward the j-th minimum can be numerically estimated as 

r « = ^ ■ < 26 > 

where we have dropped the dependence on j in the escape time. 

We have focused our numerical analysis in the temperature range [T g ,Tg], where the polymer spends most of the time in 
a collapsed state. In order to obtain a sufficient statistics the parameters of USs have to be suitably tuned according to the 
temperature values. In particular, the sampling time At has to be maintained sufficiently short to avoid back-crossings of the 
barrier and multiple jumps. More precisely, it must be smaller than the lifetime of all NNMs. Since all lifetimes decrease with 
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FIG. 4: a) Angular frequencies a>i associated to the Hessian of the NC for the sequence SO (squares), SI (circles) and S4(diamonds); b) 
angular frequencies uj^ associated to the Hessian of a saddle separating the NC from a NNM for the sequence SI. 



the temperature, increasingly smaller At's have to be adopted when the temperature is increased l22ll . We have chosen At- 
values ranging from 1 at low temperatures (T ~ T g ), to 10 -3 at high temperatures (T ~ Tg), while the integration time step has 
been kept equal to 10~ 3 (a few tests performed with an integration time step ~ 10~ 4 have not revealed any relevant difference). 
Anyway, at low temperatures the escape rate towards a subset of NNM is so small that in practice they are never found to be 
visited over an extremely large number of simulations. This is why in Table ||l]we indicate the number n v of visited NNM as 
a function of the tempareture: it turns out that all NNM are visited already at T = 0.08 for both SO and SI, while for S4 new 
minima are found up to T = 0.1 (the highest temperature we have examined). 

The numerically computed transition rates are presented in Fig. |5J where only the results for the most visited NNMs are 
reported (see the dashed lines). The solid lines refer to the theoretical estimates. We expect that Eq. i2l\ holds, since the chosen 
value of the damping coefficient, 7 = 6.9, should guarantee an overdamped dynamics. 

While at T ~ 0.04, the analytic expression Tl is in good agreement with the numerical estimates (for all of the three 
sequences), at higher temperatures the theoretical expression overestimates the escape rate. In order to perform a quantitative 
comparison, it is convenient to compute the ratio 

(27) 

3 m ' (27) 

for each escape process towards an NNM and to then average over all neighbouring minima 

3 

where Pj — T(j) j ^\ T(i) is the probability that an US ends up in the j'-th NNM and the sum is restricted to those minima that 
have been visited at least twice. The values of (r) for the three sequences at four different temperatures from 0.04 up to 0. 1 are 
reported in Tab. HP In this range, statistically reliable estimates are obtained already for M > 10 3 . In all cases, the theoretical 
formula is reasonably accurate ad low temperatures, but it downgrades when going above the "folding" temperatures and the 
phenomenon is more evident in the case of the good folder S 1 . 



V. DISCUSSION OF THE RESULTS 



In this section we discuss several independent factors that can a priori affect the observed escape rates. First of all, we have 
investigated whether intrinsic fluctuations due to the chaotic dynamics contribute significantly to the escape rates. In order 
to examine this point, we perfromed microcanonical simulations at various temperatures (namely T = 0.04, 0.06, and 0.08). 
Although the potential energy is larger than the barrier height, no jumps have been observed in simulations lasting up to 2 • 10 6 
time units. This means that local fluctuations are not strong enough to trigger jumps between neighbouring valleys in the 
presence of a global energy conservation. Therefore, fluctuations due to the coupling with independent heat baths are a crucial 
ingredient in establishibg the time scale of the escape rate. 

Yet the observed discrepancy between the analytic expression ( 12 II and the numerically evaluated escape rate calls for an 
explanation. Eq. (121 1 has been derived by making several assumpation that may not be fulfilled in practice: 
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FIG. 5: Transition rates F versus the barrier height AV: comparison of the theoretical expression <2H (solid line) with the numerical estimate 
1261 (dashed line) for two different temperatures (T = 0.04 and T = 0.06) for the three studied sequences. Namely, (a) SO; (b) SI and (c) S4. 
The numerical data refer to a total number M = 1, 000 — 5, 000 of USs and to a sampling time At = 1. 

1 . the value of the friction coefficient 7 should be larger than the frequency associated to the expanding direction of the 
saddle u>«, i.e. the dynamics should be overdamped; 

2. the configurational probability density P(r, t) should be almost stationary, i.e. the polymer should be well thermalized 
before a jump occurs; 

3. the potential energy should be well approximated by the quadratic contributions in the relevant regions around both the 
saddles and the NC; 

In the following we investigate the validity of these assumptions. 

A. Overdamped Limit 

Throughout this paper we have fixed 7 equal to 6.9 in adimensional units. In order to check whether this is a meaningful 
choice in the protein context, we must express the damping rate in physical units, 7 = 6.9m/r , where m is the mass of a 
typical aminoacid, while r Q is the period of small oscillations within the potential well. Since m ~ 10~ 22 g and r D ~ 10 -12 
s, it follows that 7 ~ g/s, a value to be compared with the typical relaxation rate due to collisions with water molecules 
jh 2 o — 10~ 8 — 10~ 9 g/s (see Ref. 112 3ll for a more detailed discussion). Our choice of 7 is, therefore, not too far from reality. 
Moreover, since 7 is already four times larger than the maximum wb, we expect the system dynamics to be in the overdamped 
regime. 

Anyhow, it is instructive to investigate whether the damping rate is responsible for the non perfect agreement between numer- 
ical data and the approximate theoretical expression. In order to clarify this point we performed further USs with both a smaller 
(7 = 1) and a larger (7 = 49) friction. Since 7, in the former case, is of the order of tvn , one should merely expect multiplicative 
corrections arising from the C/7 factor in Eq. i22\ . By neglecting saddle-to-saddle fluctuations of uiu (assumed always equal to 
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TABLE II: For all the sequences and four different temperatures T we report: the numerically estimated average escape time (r), the number 
of visited NNM n v (the numbers within brackets refer to the minima visited more than once) and the weighted ratio (r). 



T 


(r) 


n v 


<r> 


SO - 0.04 


6484 


20(16) 


1.37 


0.06 


265 


27(25) 


1.42 


0.08 


42 


31(30) 


1.76 


0.10 


19 


31(30) 


3.43 


SI - 0.04 


4375 


28(23) 


1.31 


0.06 


168 


32(27) 


1.95 


0.08 


24 


37(34) 


2.05 


0.10 


19 


37(36) 


3.89 


S4 - 0.04 


129 


12(11) 


1.30 


0.06 


58 


28(24) 


1.42 


0.08 
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27(25) 


2.16 


0.10 


18 


30(24) 


3.74 
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FIG. 6: Numerical transition rates T vs. the barrier height A V for two different 7-values. Filled circles refer to 7 = 1, while empty triangles 
correspond to 7 = 49. The latter have been scaled by a factor 49 x 0.56 (see text). Simulations refer to sequence SI at T — 0.07. The number 
ofUSs is M ~ 4,000. 

its average value 1.17), this amounts to a correction term 49 x 0.56 . In Fig.|5] we have multiplied by this factor the numerical 
data obtained for 7 = 49. The quite good overlap between the two sets of data confirms that the dependence on the damping 
term is well reproduced by the theoretical formula (part of the oscillations is of statistical nature and part is due to the neglected 
a>ii fluctuations). 

B. Thermalization time 

A fundamental hypothesis implicitely made in the derivation of the analytical expression for the escape rate is that the probabil- 
ity density of initial conditions inside the native valley can be approximated by the product of the Boltzmann-Gibbs distribution 
times a factor that differs significantly from 1 only in the neighborhood of the saddle. In order to verify the validity of this 
assumption, we have studied the decay of the autocorrelation functions for the kinetic and potential energy in several USs with 
SO and SI. 

From the data reported in Tab. 11111 it is evident that the typical correlation times are much smaller than the average escape 
times at all the examined temperatures. Since the correlation time is a reasonable estimate for the time required to reach a "local 
thermal equilibrium", these results suggest that, whatever the distribution of initial condition used for the USs is, the system 
thermalizes before escaping. In fact, USs performed by starting from different sets of initial conditions lead to close T values. 
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TABLE III: Decay times for the autocorrelation function of the kinetic energy tk and potential energy TV, for SO and SI at various temper- 
atures T. The times tk.v are estimated by assuming an exponential decay for the autocorrelation functions. The functions has been obtained 
by averaging over M = 10 different USs. 



T 
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TV - SO 


TK- SI 


TV ' SI 
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FIG. 7: Transition rates T versus the barrier height AV. Langers's expression <2U (dashed line) is compared with the numerical estimate 
for two choices of initial conditions: with (dot-dashed line) and without (continuous line) thermalization. Data refers to the sequence SI at 
temperature T = 0.07 with M — 1, 000 and a thermalization time T = 100. 

For instance, in Fig.Qone can compare the results of simulations started from a Maxwellian distribution of the velocities (dot- 
dashed line) with those obtained by gradually warming (with a 7 x 10~ 4 rate) an initially frozen configuration. The relative 
differences are much smaller than the deviations from the theoretical expectation (dashed line). 

However, Tab.lIIIIbrings forth some interpretative problems: while the correlation time of the kinetic energy does not depend 
on T and is proportional to the friction coefficient, the correlation time of the potential energy V(t) = V\(t) + Vt(t) + V^i) 
decreases with temperature. This phenomenon can be directly observed in Fig. |8j where the absolute value of C(t) = 
(V(t)V(t + t)) is plotted for different temperatures. The partial slowing down is an entirely nonlinear effect, since no temper- 
ature dependence can arise in a purely harmonic potential. In the next subsections we shall see that nonlinearities are indeed at 
the origin of the limited validity of the theoretical formula. 

C. Role of nonlinearities 

Langer's estimate assumes that the potential is harmonic in the vicinity of both the NC and the saddle. In order to test whether 
nonlinear corrections may be important, we have estimated expression dl9l under the hypothesis of a fully separable potential. In 
fact, from the products of the one-dimensional integrals in Eqs. (I24I25K one can at least establish whether nonlinear corrections 
are truly important. In practice, we have evaluated the integrals along the eigendirections of the Hessian in the NC and in the 

corresponding saddle. The integration interval for the i-th eigendirection is set equal to [— r* , r*], where r* = 3\J 2-KT/u) a 1 ^ for 

I a and analogously r * = 3 y 2ttT '/tJf* for I± . 

The comparison of this expression with both numerical results and the standard Langer's formula is presented in Fig.|9]for 
the sequence SO. Although there is no reason to expect the potential to be separable, it is interesting to notice that the refined 
theoretical expression improves over Langer's formula. On the other hand, the remaining sizeable deviations from the numerical 
results indicate the need of a really improved theoretical formula. 

The derivation of Langer's formula reveals that an activation process can be decomposed into an Arrenhius factor, controlled 
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FIG. 8: Initial decay of the absolute value of the autocorrelation function C(r) for the potential energy during canonical simulations at 
T = 0.02 (continuous line), T = 0.04 (dashed line), T = 0.06 (dot dashed line),T = 0.08 (dotted line). For each temperature C(r) is 
averaged over a time span of 50 and over 10 different trajectories. 
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FIG. 9: Transition rate Y versus the barrier height: comparison of Langer's expression 12 H (solid line) with numerical estimate I26i (dashed 
line) and with expression i 1 91 equipped with Eqs. 1241251 (dot-dashed line) for SO at T=0.1. Numerical estimates have been obtained with a 
sampling time At — 0.03 and performing N= 3000 USs. 

by the energy difference AV, and an entropic factor R. The analysis of the latter one conveys useful information on the structure 
of the potential energy landscape and helps sheding some light on the above mentioned discrepancies. While, a priori, there is 
no reason to expect R to be either smaller or larger than 1, in practice, it is almost always smaller than 1 (see Fig.H0>: the only 
exceptions are four saddles all around the energy minimum of S4, which are also characterized by extremely low barriers. In 
fact, such saddles are quite peculiar in that they almost coincide with the NC except for the first 3-4 beads, which are, on the 
other hand, relatively distant from the core of the configuration |3- Similar distributions of i?-values have been also found in 
small clusters of particles interacting through Lennard- Jones potentials |j l8lfl9ll . 

The presence of such large entropic factors accounts for the peculiar dependence of the escape rate on AV for S4. The 
abrupt drop of T when AV^ is decreased below 0.2 (see Fig.[5}is due the smallness of the entropic contribution (T is inversely 
proportional to R). This interpretation is further confirmed by the slow dependence of V on the temperature for such saddles. 

Leaving aside this peculiarity, there is an average tendency of R to decrease upon increasing the barrier height in both SO and 
S4; this indicates that higher barriers correspond to flatter saddles and thereby can be more easily overcome. This is not the case 
of the good folder, where R does not show any clear trend and is always bounded in the interval [10 -2 , lO^ 1 ]. Accordingly, all 
the NNMs are entropically equivalent and the escape rate is essentially determined by the Arrhenius factor. 

The most interesting observation can be, however, made by parametrizing the relative error ej (see Eq. 1271 ) of the escape 
rate Tj towards the jth NNM. From Fig. ^2 one can see that the deviation of Langer's formula becomes systematically larger 
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FIG. 10: Entropy ratios R versus the barrier height AV, for the three studied sequences: (a) SO , (b) SI and (c) S4. 

upon decreasing R. This is qualitatively understandable, since a small R implies a flat saddle; therefore it is reasonable to expect 
nonlinear corrections to be more relevant. What is less obvious is the observed dependence of ej on R. The data reported in 
Fig.llllreveal that ej » a/R, which amounts to conjecturing that T ~ 1/(R + a) with a basically independent of j. This is 
quite a remarkable result considering that the fit is more convicing at larger temperatures, when values as large as 60 of ej are 
observed. 



VI. CONCLUDING REMARKS 

We have studied in detail the unfolding dynamics of short homo- and heteropolymeric chains made of iV-beads in two dimen- 
sions. Polymers have been simulated via an off-lattice model previously introduced |8] and their evolution has been examined 
within the canonical ensemble. Our results suggest that the dynamics of polymers within their native valley can be described 
as a thermally activated process in a 2iV-dimensional space in a whole range of temperatures above and below the folding 
temperature. 

As a matter of fact, Langer's estimate for the escape rate represents a good approximation for all the examined sequences at 
low temperatures. We have verified that discrepancies between Langer's estimate and numerical data are mainly due to the poor 
approximation (limited to the harmonic terms) of the potential around the stationary points. A better estimate can be derived by 
taking into account higher order terms in the expansion of the potential. Since the folding temperatures for the homopolymer 
(SO) and the bad folder (S4) are relatively low, for these sequences the dynamics within the native valley can be reproduced 
reasonably well already with Langer's approximation. While for the good folder (SI), this is not the case, thus suggesting a 
more relevant role of nonlinearities at the folding temperature. In any case all the examined estimates turn out to be upper 
bounds for the escape rates. 

An analysis of the entropic contribution to the escape rate suggests that the folding behaviour of a sequence can be related to 
topological properties of the landscape around the NC. For bad folders higher energy barriers are associated to flatter saddles, 
thus favouring jumps towards more unfolded configurations. On the other hand, for the good folder the entropy ratio seems not 
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FIG. 11: The relative error ej versus the entropic factor R for SO (circles), SI (diamonds) and S4 (crosses). The data have been obtained by 
sampling a trajectory every 0.3 time units at a temperature T = 0.1. The dashed line is a guide for the eye and it corresponds to a slope one. 



to be related to the heigth of barriers. 

We would like to stress that our analysis amounts to explore the free energy landscape of a polymer, since in the jumping 
rate estimates are included not only Kramers' terms but also entropic contributions. The relevance of the latters in determining 
equilibrium and kinetic properties of peptides has been recently pointed out in 

In order to further explore the role of activation processes for the complete folding dynamics we plan to extend our analysis 
to the whole energy landscape. A complete graph describing all the folding/unfolding paths with their associated probabilities 
will allow to determine equilibrium properties of the system and possibly to distinguish bad and good folders. 
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APPENDIX A: ALGORITHM FOR THE IDENTIFICATION OF THE SADDLES 



The algorithm described in this appendix (see also |25] ) aims at determining the path of minimal potential energy connecting 
two minima indicated as ai and &2- 

Due to the symmetries of the potentials defined in Section II (see eq. (0 ), each spatial configuration of the polymeric chain 
is defined apart from a translation, a reflection and a rotation around an axis perpendicular to the xy-plane. Accordingly, after 
having expressed ai in an arbitrary reference frame, it is convenient to determine the coordinates of a.2 are by minimizing its 
Euclidean distance from ai with respect to the above mentioned symmetry transformations. 

The algorithm then consists in evolving a suitably chosen path connecting ai and a.2 according to a gradient dynamics until 
the maximum of the energy along the path converges to a minimum corresponding to a saddle. More precisely, the approach is 
split into three steps: 

1 . Choice of the initial configurations The initial path Co connecting ai and a 2 is generally chosen by linearly interpolating 
between their coordinates, 

x(i) = x 1 (i) + r(x 2 (i) - x 1 ^)) 
y(i) = y 1 (i) + r(y 2 (i)~y 1 (i)) 

i = l,...,N . (Al) 
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The sequence of initial configurations along the path Co is fixed by varying the parameter r between and 1 (we typically 
choose r = to/100, < m < 100). 

2. Evolution of the configurations Each configuration is then let evolve according to the gradient dynamics 

1 dH 

7 OXi 
1 dH 

m = . (A2) 

In practice, the damping coefficient 7 can be chosen equal to one, since it only determines the evolution time scale. 
The integration time step St is adapted to the instantaneous value of the force field, St — min(.01, .01/F max ), where 
F m ax = maxi { I I , |/f |} while ff and ff are the x and y components of the force acting on the ith bead. 

3. Interpolation phase After letting evolve the system for a time i = lOSt, the Euclidean distance A m (t) between the TOth 
and the (m + l)st point is computed. If A m (t) > 2A m (0), a new configuration, is added between the two points by 
linearly interpolating between them. If A m (t) < A m (0)/2 the (to + l)-st configuration is removed. In this way, we are 
able to work with a set of uniformly distributed configurations, without loosing resolution in the regions where the energy 
gradient is large. 

The last two steps are repeated until F max in the maximum energy configuration along the path becomes smaller than a 
fixed threshold typically chosen equal to 10~ 3 . The coordinates of the saddle point are finally refined by implementing a 
standard Newton scheme. 

On one hand, the path connecting two generic minima can exhibit more than one relative energy maximum (see, e.g., Fig.1121: 
this is an indication that the two minima are not nearest neighbours. In this case, our approach allows identifying new minima. 
On the other hand, it can happen that neighbouring minima are separated by more than one saddle: upon choosing different 
initial paths, one can, in principle, identify all saddles. Our simulations suggest that multiple saddles are not very common at 
least in the vicinity of the NC. 

-4.1, 1 1 1 — 



-4.3 



V 



-4.5 




FIG. 12: Potential energy profile V connecting the NC ai to a 2NNM a3 via 2 saddles for the sequence SI, the symbol &2 indicates a NNM. 
The index rii refers to the distance measured along the unstable manifold connecting the minima. 
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